Student Name & ID: ARJUN KC (8773456)

Importing some essential Python libraries for Machine Learning

In [11]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import plotly 
import plotly.express as px
plotly.offline.init_notebook_mode()
sns.set_style("whitegrid")

Datasets available in the Seaborn repository

In [12]:
arj = sns.get_dataset_names()
print(arj,end="")
['anagrams', 'anscombe', 'attention', 'brain_networks', 'car_crashes', 'diamonds', 'dots', 'dowjones', 'exercise', 'flights', 'fmri', 'geyser', 'glue', 'healthexp', 'iris', 'mpg', 'penguins', 'planets', 'seaice', 'taxis', 'tips', 'titanic']

Now we use one of the Seaborn repository datasets: the "IRIS" dataset

In [13]:
from IPython.display import Image
Image(url='https://www.tensorflow.org/images/iris_three_species.jpg')
Out[13]:
In [14]:
data = sns.load_dataset("iris")
data.head()
Out[14]:
sepal_length sepal_width petal_length petal_width species
0 5.1 3.5 1.4 0.2 setosa
1 4.9 3.0 1.4 0.2 setosa
2 4.7 3.2 1.3 0.2 setosa
3 4.6 3.1 1.5 0.2 setosa
4 5.0 3.6 1.4 0.2 setosa

--> The head() function returns the first five rows of the IRIS dataset
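
As a small illustration of how head() behaves with an explicit row count, the sketch below rebuilds the five rows printed above as a stand-alone DataFrame. The name `sample` is hypothetical and not part of the notebook; in the notebook itself the same calls would be made on `data`.

```python
import pandas as pd

# Hypothetical stand-alone frame: the five rows printed by data.head() above.
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal_width": [3.5, 3.0, 3.2, 3.1, 3.6],
    "petal_length": [1.4, 1.4, 1.3, 1.5, 1.4],
    "petal_width": [0.2, 0.2, 0.2, 0.2, 0.2],
    "species": ["setosa"] * 5,
})

first_three = sample.head(3)  # head(n) returns the first n rows
last_two = sample.tail(2)     # tail(n) is the counterpart for the last n rows
print(first_three)
print(last_two)
```

Calling head() without an argument falls back to the default of five rows, which is what the cell above relies on.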

In [15]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object 
dtypes: float64(4), object(1)
memory usage: 6.0+ KB

--> The info() function prints a concise summary of the dataset: index range, column names, non-null counts, dtypes, and memory usage
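
The non-null counts and dtypes reported by info() can also be obtained as objects for further checks. The following is a minimal sketch on a hypothetical three-row frame (`sample`, with one deliberately missing value), not on the notebook's `data`.

```python
import pandas as pd

# Hypothetical frame with one missing value, used only to demonstrate the checks.
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, None],
    "sepal_width": [3.5, 3.0, 3.2],
    "species": ["setosa", "setosa", "setosa"],
})

# Number of missing entries in each column (the complement of info()'s non-null count).
missing_per_column = sample.isna().sum()
# Names of the numeric columns, selected by dtype.
numeric_columns = sample.select_dtypes(include="number").columns.tolist()
print(missing_per_column)
print(numeric_columns)
```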

In [16]:
data["species"].value_counts()
Out[16]:
species
setosa        50
versicolor    50
virginica     50
Name: count, dtype: int64

--> The value_counts() function returns a Series containing the counts of unique values.
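
To see the same class balance as proportions rather than counts, value_counts() accepts `normalize=True`. The sketch below uses a hypothetical Series named `species` that mirrors the 50/50/50 split reported above.

```python
import pandas as pd

# Hypothetical Series mirroring the class balance reported above (50 rows per species).
species = pd.Series(
    ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50,
    name="species",
)

counts = species.value_counts()                # absolute count per unique value
shares = species.value_counts(normalize=True)  # relative frequencies that sum to 1
print(counts)
print(shares)
```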

Let's visualize the IRIS data using Matplotlib

In [17]:
# DataFrame.plot.scatter creates its own figure, so the size is passed to each call.
# Calling plt.figure() beforehand would only leave an empty figure behind.
species_colors = {"setosa": "tab:blue", "versicolor": "red", "virginica": "orange"}
for name, color in species_colors.items():
    ax = data[data.species == name].plot.scatter(
        x="sepal_length", y="sepal_width", color=color,
        label=name.capitalize(), figsize=(10, 8),
    )
    ax.set_xlabel("Sepal Length")
    ax.set_ylabel("Sepal Width")
    ax.set_title(f"Relationship of Sepal Length and Sepal Width on {name.capitalize()}")

plt.show()
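
If a single chart comparing all three species is preferred, one option is to create the axes once and pass them to every scatter call through `ax=`. The sketch below assumes a small illustrative frame named `toy` (a hypothetical subset, not the full dataset) and selects a non-interactive backend so that it also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only; in the notebook, `data` from sns.load_dataset("iris") would be used.
toy = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "species": ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"],
})

fig, ax = plt.subplots(figsize=(10, 8))
colors = {"setosa": "tab:blue", "versicolor": "red", "virginica": "orange"}
for name, group in toy.groupby("species"):
    # ax=ax makes every call draw on the same axes instead of opening a new figure
    group.plot.scatter(x="sepal_length", y="sepal_width",
                       color=colors[name], label=name.capitalize(), ax=ax)
ax.set_xlabel("Sepal Length")
ax.set_ylabel("Sepal Width")
ax.set_title("Sepal Length vs. Sepal Width by Species")
```

In a notebook, finishing the cell with plt.show() displays the combined figure.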
In [18]:
data.plot(kind="density",figsize=(6.5,5))
plt.show()

Let's visualize the IRIS data using Seaborn

In [19]:
plt.figure(figsize=(10,10))
sns.boxplot(x="sepal_length", y="sepal_width", hue="species", palette=["r", "g", "b"], data=data)
Out[19]:
<Axes: xlabel='sepal_length', ylabel='sepal_width'>
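
If the goal is to compare the distribution of sepal width across species, a common alternative is to put the categorical column on the x-axis so that each species gets one box. This is a sketch on a hypothetical illustrative frame `toy`, not the notebook's own plot; the backend line is an assumption for running it as a script.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for script use; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows only; in the notebook, `data` would be passed instead.
toy = pd.DataFrame({
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.2, 2.3, 3.3, 2.7, 3.0],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

fig, ax = plt.subplots(figsize=(10, 10))
# One box per species: categorical variable on x, numeric variable on y.
sns.boxplot(data=toy, x="species", y="sepal_width", ax=ax)
```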

Let's visualize the IRIS data using Plotly

In [20]:
fig = px.scatter_3d(data, x="sepal_length", y="sepal_width", z="petal_length", size="petal_width", color="species")
fig.show()